Exploiting Loop-Level Parallelism for SIMD Arrays Using OpenMP
Abstract
Programming SIMD arrays in languages such as C or FORTRAN is difficult, and although work on automatically parallelizing compilers has achieved much, the results are far from satisfactory; in particular, almost all ‘fully’ automatic parallelizing compilers place fundamental restrictions on the input language. OpenMP offers an alternative approach to parallel programming that supports incremental improvement of applications, placing restrictions only where needed to preserve the semantics of a parallelized section. However, OpenMP is limited to a thread-based model and does not define a mapping onto other parallel programming models. In this paper, we describe an alternative approach to programming SIMD machines using an extended subset of OpenMP (OpenMP SIMD), which allows us to model naturally the programming of arbitrarily sized SIMD arrays while retaining the semantic completeness of both OpenMP and the parent languages. Using the CSX architecture, we show how OpenMP SIMD can be implemented, and we discuss future extensions to both the language and SIMD architectures to better support explicit parallel programming.
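To make the loop-level starting point concrete, the following is a minimal sketch (not code from the paper) of the kind of OpenMP data-parallel loop that an OpenMP SIMD implementation would map onto the processing elements of a SIMD array rather than onto threads; the function and variable names are illustrative.

```c
#include <stddef.h>

/* Illustrative loop-level parallel kernel. Under the usual thread-based
 * OpenMP model the iterations are divided among threads; under a SIMD
 * mapping, blocks of iterations would instead be assigned to the
 * processing elements of the array. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    #pragma omp parallel for
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

The point of the paper's approach is that a loop of this form carries the parallelism explicitly, so no restriction on the input language is needed beyond preserving the semantics of the parallelized section.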
Similar papers
Explicit Vector Programming with OpenMP 4.0 SIMD Extensions
Modern CPU and GPU processors integrate SIMD execution units on-die to achieve higher performance and power efficiency, which poses the challenge of using the underlying SIMD hardware (or VPUs, Vector Processing Units) effectively. Wide vector registers and SIMD instructions (single instructions operating on multiple data elements packed in wide registers), such as AltiVec [2], SSE, AVX [10] ...
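As a brief illustration of the explicit vector programming this paper describes, the sketch below uses the OpenMP 4.0 `omp simd` construct: the directive asserts that iterations may execute in SIMD lanes, and the reduction clause tells the compiler how per-lane partial results combine. The function and names are assumptions for the example, not taken from the paper.

```c
#include <stddef.h>

/* Explicit vectorization with OpenMP 4.0: the loop is declared safe
 * to run across SIMD lanes, with `sum` combined by addition. */
float dot(size_t n, const float *a, const float *b)
{
    float sum = 0.0f;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}
```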
SIMD parallel MCMC sampling with applications for big-data Bayesian analytics
We present a single-chain parallelization strategy for Gibbs sampling of probabilistic Directed Acyclic Graphs, where contributions from child nodes to the conditional posterior distribution of a given node are calculated concurrently. For statistical models with many independent observations, such parallelism takes a Single-Instruction-Multiple-Data form, and can therefore be efficiently imple...
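The pattern described here maps naturally onto a SIMD reduction: with conditionally independent observations, the log of a node's conditional posterior is a sum of per-observation terms. The following is a hedged sketch of that idea only; the Gaussian likelihood and all names are assumptions for illustration, not the paper's model or code.

```c
#include <stddef.h>

/* Log conditional posterior of a parameter `theta` given n independent
 * observations y[i]: a per-observation sum computed in SIMD lanes.
 * A unit-variance Gaussian term stands in for the real model. */
double cond_log_posterior(double theta, const double *y, size_t n,
                          double log_prior)
{
    double loglik = 0.0;
    #pragma omp simd reduction(+:loglik)
    for (size_t i = 0; i < n; i++) {
        double r = y[i] - theta;
        loglik += -0.5 * r * r;
    }
    return log_prior + loglik;
}
```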
Overlapping communication and computation with OpenMP and MPI
Machines composed of a distributed collection of shared-memory or SMP nodes are becoming common for parallel computing. OpenMP can be combined with MPI on many such machines. Motivations for combining OpenMP and MPI are discussed. While OpenMP is typically used for exploiting loop-level parallelism, it can also be used to enable coarse-grain parallelism, potentially leading to less overhead. We s...
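A common hybrid pattern in this vein is sketched below, under assumptions of my own rather than the paper's: non-blocking MPI halo exchange is posted first, an OpenMP loop updates interior points that need no remote data while the messages are in flight, and only then are the halos awaited. The kernels `compute_interior` and `compute_boundary` are hypothetical.

```c
#include <mpi.h>

void compute_interior(double *u, int i);             /* hypothetical kernels */
void compute_boundary(double *u, double lo, double hi, int n);

/* One step of a 1-D decomposition: overlap the halo exchange with
 * the OpenMP update of interior points. */
void step(double *u, double *halo_lo, double *halo_hi, int n,
          int lo_rank, int hi_rank, MPI_Comm comm)
{
    MPI_Request req[4];
    MPI_Irecv(halo_lo, 1, MPI_DOUBLE, lo_rank, 0, comm, &req[0]);
    MPI_Irecv(halo_hi, 1, MPI_DOUBLE, hi_rank, 1, comm, &req[1]);
    MPI_Isend(&u[0],     1, MPI_DOUBLE, lo_rank, 1, comm, &req[2]);
    MPI_Isend(&u[n - 1], 1, MPI_DOUBLE, hi_rank, 0, comm, &req[3]);

    #pragma omp parallel for            /* interior work overlaps the exchange */
    for (int i = 1; i < n - 1; i++)
        compute_interior(u, i);

    MPI_Waitall(4, req, MPI_STATUSES_IGNORE);
    compute_boundary(u, *halo_lo, *halo_hi, n);
}
```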
Strategies for the efficient exploitation of loop-level parallelism in Java
This paper analyzes the overheads incurred in the exploitation of loop-level parallelism using Java Threads and proposes some code transformations that minimize them. The transformations avoid the intensive use of Java Threads and reduce the number of classes used to specify the parallelism in the application (which reduces the time for class loading). The use of such transformations results in...
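The paper's transformations target Java Threads, but the underlying idea, paying thread-management cost once rather than per loop, has a direct OpenMP analogue, sketched here in C under my own assumptions: a single persistent parallel region encloses several work-sharing loops instead of forking and joining threads for each one.

```c
/* Threads are created once for the whole region; each `omp for`
 * shares iterations among them, with an implicit barrier between
 * phases, avoiding repeated thread creation per loop. */
void smooth(float *a, float *b, int n, int iters)
{
    #pragma omp parallel
    for (int t = 0; t < iters; t++) {
        #pragma omp for
        for (int i = 1; i < n - 1; i++)
            b[i] = 0.5f * (a[i - 1] + a[i + 1]);
        #pragma omp for
        for (int i = 1; i < n - 1; i++)
            a[i] = b[i];
    }
}
```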
Optimization Strategies for WRF Single-Moment 6-Class Microphysics Scheme (WSM6) on Intel Microarchitectures
Optimizations in the petascale era require modifications of existing codes to take advantage of new architectures with large core counts and SIMD vector units. This paper examines high-level and low-level optimization strategies for numerical weather prediction (NWP) codes. These strategies employ thread-local structures of arrays (SOA) and an OpenMP directive such as OMP SIMD. These optimizati...
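To show why the structure-of-arrays (SOA) layout pairs with OMP SIMD, here is a small sketch; the field names are invented for illustration and are not taken from WSM6. In the SoA form each field is a contiguous, unit-stride array, which the `omp simd` loop can vectorize cleanly, whereas an array-of-structures layout would force strided loads.

```c
#define N 1024

struct cell_aos { float qv, qc, qr; };   /* AoS: strided, awkward to vectorize */

struct cells_soa {                       /* SoA: each field unit-stride */
    float qv[N], qc[N], qr[N];
};

/* Hypothetical microphysics-style update over contiguous SoA fields. */
void condense(struct cells_soa *c, float rate)
{
    #pragma omp simd
    for (int i = 0; i < N; i++) {
        c->qc[i] += rate * c->qv[i];
        c->qv[i] -= rate * c->qv[i];
    }
}
```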